-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: fix astype conversion string -> float #37974
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls add a whatsnew note as well (I don't think this actually closes the original as you mentioned many cases, but that's ok)
@@ -226,6 +226,14 @@ def test_astype_int(): | |||
tm.assert_extension_array_equal(result, expected) | |||
|
|||
|
|||
def test_astype_float(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use the nulls_fixture (though may have to skip for pd.NaT)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand what for? IIUC pd.Series(["1.1", null], dtype="string")
should be equal for null
in pd.NA
, np.nan
, pd.NaT
(works), None
. This behaviour should be tested somewhere else and adding the fixture would just result in repeating the same test four times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR!
Please see the comments below.
Would you consider fixing the similar issue with astype("int")
conversion as well in a separate PR?
Sure.
Can you expand on the issue and, if not obvious, with the expected behaviour? |
Similar to float.
|
Hmm. As far as I understood,
However, using the nullable extension array dtype
Would you prefer if above ( Edit: Were you possibly referring to Unrelated: Any idea what's up with the failing tests? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping on green
@jreback Could you have another look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. @jorisvandenbossche if any comments.
thanks @mlondschien |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
Fixes the
string
->float
conversion as discussed in #37626 in presence ofpd.NA
. I used the same approach as used instring
->Int64
. Note that differently to theInt64
case, we return anumpy.ndarray
, not apd.array
.I added a test that compares
pd.Series
. Replacing this withpd.array
thetm.assert_numpy_array_equal
assertion will raise, asexpected
is of typeFloatingArray
. IIUC this is a new feature of pandas 1.2.0. Should we return this in.astype("float")
by default?